A Longitudinal Analysis of Search Engine Index Size

Authors

  • Antal van den Bosch
  • Toine Bogers
  • Maurice de Kunder
Abstract

One of the determining factors of the quality of Web search engines is the size of their index. In addition to its influence on search result quality, the size of the indexed Web can also tell us something about which parts of the WWW are directly accessible to the everyday user. We propose a novel method of estimating the size of a Web search engine’s index by extrapolating from document frequencies of words observed in a large static corpus of Web pages. In addition, we provide a unique longitudinal perspective on the size of Google’s and Bing’s indexes over a nine-year period, from March 2006 until January 2015. We find that index size estimates of these two search engines tend to vary dramatically over time, with Google generally possessing a larger index than Bing. This result raises doubts about the reliability of previous one-off estimates of the size of the indexed Web. We find that much, if not all, of this variability can be explained by changes in the indexing and ranking infrastructure of Google and Bing. This casts further doubt on whether Web search engines can be used reliably for cross-sectional webometric studies.

Conference Topic: Webometrics
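The extrapolation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple proportional model: if a word occurs in a known fraction of documents in a static corpus, and a search engine reports some number of hits for that word, then the index size is roughly the hit count divided by that fraction. The abstract does not give the exact formula, the word selection, or the averaging scheme, so the function names, the plain mean, and all numbers below are hypothetical.

```python
from statistics import mean


def estimate_index_size(corpus_size, corpus_doc_freq, reported_hits):
    """Extrapolate an index size from a single word.

    Assumption (not stated in the abstract): the word's relative document
    frequency in the static corpus equals its relative document frequency
    in the search engine's index.
    """
    if corpus_doc_freq <= 0:
        raise ValueError("the word must occur in at least one corpus document")
    return reported_hits * corpus_size / corpus_doc_freq


def estimate_from_words(corpus_size, observations):
    """Average the single-word estimates over several words.

    `observations` maps each word to a pair
    (document frequency in the corpus, hit count reported by the engine).
    Using a plain arithmetic mean is an illustrative choice.
    """
    estimates = [
        estimate_index_size(corpus_size, doc_freq, hits)
        for doc_freq, hits in observations.values()
    ]
    return mean(estimates)


# Hypothetical numbers, for illustration only.
corpus_size = 1_000_000
observations = {
    "the": (600_000, 12_000_000_000),
    "basic": (50_000, 1_100_000_000),
}
estimated_size = estimate_from_words(corpus_size, observations)
print(f"Estimated index size: {estimated_size:,.0f} documents")
```

Averaging over words from different frequency ranges is one way to dampen the effect of any single word whose reported hit count is unreliable. Repeating the estimate on a regular schedule would yield the kind of time series the longitudinal analysis relies on.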

Similar resources

A Comparing between the impacts of text based indexing and folksonomy on ranking of images search via Google search engine

Background and Aim: The purpose of this study was to compare the impact of text-based indexing and folksonomy on image retrieval via the Google search engine. Methods: This study used an experimental method. The sample was 30 images extracted from the book “Gray anatomy”. The research was carried out in 4 stages; in the first stage, images were uploaded to an “Instagram” account so the images are tagge...

Evaluation of the Tehran University of Medical Sciences website based on webometric criteria in 2008

Background and Aim: Nowadays, university websites are very important in information services. Universities have therefore designed websites to categorize large amounts of information and make it available. This study aimed to evaluate the Tehran University of Medical Sciences website based on webometric criteria in 2008. Materials and Methods: This survey used the link analysis metho...

Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines

Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research on search engine evaluation. The purpose of this study was therefore to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...

WebSeer: An Image Search Engine for the World Wide Web

Because of the size of the World Wide Web and its inherent lack of structure, finding what one is looking for can be a challenge. In fact, some of the most highly visited Web sites are search engines. However, while Web pages typically contain both text and images, most currently available search engines only index text. This paper describes WebSeer, a system for locating images on the Web. Web...

Factors Related to the Age at Menarche in Iran: A Systematic Review and Meta-Analysis

Background: Reduced age at menarche is an important health indicator for women and may be associated with complications such as an increased risk of asthma, breast cancer, ovarian cancer, and type 2 diabetes. We aimed to examine the factors related to the age at menarche in Iran. Materials and Methods: In this systematic review and meta...

Publication date: 2015